State- and County-Level Geographic Variation in Opioid Use Disorder, Medication Treatment, and Opioid-Related Overdose Among Medicaid Enrollees

This cross-sectional study analyzes the prevalence of opioid use disorder (OUD) and rates of medication treatment and nonfatal overdose among Medicaid enrollees in the US by state and county from 2016 to 2018.


eAppendix 2. Data quality assessment and exclusion of states
Based on a data quality assessment, we selected two sets of states for our study: one for the main analysis and a more restrictive one for the sensitivity analysis.
Exclusion of states for the main analysis. Our intention for the main analysis was to provide a full overview of the distribution of the Transformed Medicaid Statistical Information System (T-MSIS) Analytic Files (TAF) data. To this end, we applied only minimal exclusion criteria based on the data quality assessment. Specifically, we excluded from our analysis the following states, which had completely missing dual eligibility or county information for at least one study year:
We also excluded one state (Illinois) from analyses related to methadone treatment or any medication for OUD treatment. The state had zero methadone claims even though it had at least one Opioid Treatment Program (OTP) accepting Medicaid patients operating in that year (based on publicly available information about OTPs from the National Survey of Substance Abuse Treatment Services, N-SSATS) and covered methadone from 2016 to 2018 according to our review of state policies. [1][2][3][4] Thus, we would expect to observe methadone claims in Illinois, but did not.
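The main-analysis exclusion rule can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the input layout (a share of missing values per state, year, and field), the field names `dual_eligibility` and `county`, and the placeholder state labels are all hypothetical and do not come from the TAF files or from the study code.

```python
def states_to_exclude(missing_share):
    """Return states with a completely missing required field in any year.

    missing_share: {state: {year: {field: share of records missing, 0 to 1}}}
    (hypothetical layout chosen for this sketch)
    """
    required = ("dual_eligibility", "county")
    excluded = set()
    for state, by_year in missing_share.items():
        for fields in by_year.values():
            # Assumption: a field absent from the input counts as fully missing.
            if any(fields.get(name, 1.0) >= 1.0 for name in required):
                excluded.add(state)
                break  # one bad year is enough to exclude the state
    return sorted(excluded)


# Placeholder data, not real state results.
example = {
    "State A": {2016: {"dual_eligibility": 0.02, "county": 0.01},
                2017: {"dual_eligibility": 0.03, "county": 0.00}},
    "State B": {2016: {"dual_eligibility": 0.01, "county": 1.00},
                2017: {"dual_eligibility": 0.01, "county": 0.02}},
    "State C": {2016: {"dual_eligibility": 1.00, "county": 0.05}},
}
excluded_states = states_to_exclude(example)
```

Here `excluded_states` contains "State B" and "State C", because each has one field that is entirely missing in at least one year, while "State A" is retained.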

Exclusion of states for sensitivity analysis.
We created a more restrictive sample of states based on the data quality assessment by the DQ Atlas. [5] For the data elements relevant to our analysis, DQ Atlas classified TAF data elements from "low concern" to "unusable" based on their expected completeness. The exact data quality assessment depended on the type of claims. For instance, professional claims in TAF other service records should always have a procedure code, and DQ Atlas thus examined the fraction of claims with missing procedure codes to classify states' data quality for these claims. The DQ Atlas did not assess completeness of county codes, but it assessed completeness of zip codes. We therefore categorized states' data quality with respect to completeness of county codes using the same criteria that DQ Atlas uses for zip codes. There is currently no data quality assessment for NDC codes, so the data quality assessment for OUD medications was based only on HCPCS codes.

To characterize heterogeneity at county and state levels, we fitted mixed-effects multi-level logistic regression models for each outcome with random effects for counties nested within states. We used logistic regression models because all outcomes were binary. The model specification was as follows:

$$\operatorname{logit}(p_{ijk}) = \beta_0 + u_k + v_{jk}$$

where $p_{ijk}$ is the probability that individual $i$ in county $j$ of state $k$ experiences the outcome, and $\beta_0$ is the overall mean prevalence of the outcome expressed on the logistic scale. The random effect $u_k$ is the state-level residual error, which represents the smoothed difference between the population-level prevalence and the prevalence in state $k$. The random effect $v_{jk}$ is nested within the state-level random effect and represents the smoothed difference between the state-level prevalence and the county-level prevalence.
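To make the nested structure concrete, the following Python sketch simulates binary outcomes from a model with an overall intercept, a state-level random effect, and a county-level random effect nested within state. It is an illustration only, not the fitting code used in the study: the function names, the parameter values, and the assumption that both random effects are normally distributed with mean zero are ours.

```python
import math
import random


def inv_logit(x):
    """Convert a log-odds value to a probability."""
    return 1.0 / (1.0 + math.exp(-x))


def simulate_nested_logit(n_states, counties_per_state, n_per_county,
                          beta0, sd_state, sd_county, seed=0):
    """Simulate (state, county, probability, outcome) rows.

    Assumption: state and county effects are drawn from zero-mean normal
    distributions with standard deviations sd_state and sd_county.
    """
    rng = random.Random(seed)
    rows = []
    for k in range(n_states):
        u_k = rng.gauss(0.0, sd_state)          # state-level deviation
        for j in range(counties_per_state):
            v_jk = rng.gauss(0.0, sd_county)    # county deviation within state k
            p = inv_logit(beta0 + u_k + v_jk)   # individual-level probability
            for _ in range(n_per_county):
                y = 1 if rng.random() < p else 0
                rows.append((k, j, p, y))
    return rows


# Hypothetical parameter values chosen only for illustration.
rows = simulate_nested_logit(n_states=3, counties_per_state=4, n_per_county=5,
                             beta0=-3.0, sd_state=0.5, sd_county=0.3, seed=1)
```

Every individual in the same county shares one probability, counties in the same state share the state deviation, and setting both standard deviations to zero collapses all probabilities to the value implied by the intercept alone.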
In the case of a continuous outcome fitted to a linear mixed-effects multi-level model, the intraclass correlation coefficient (ICC) is commonly used to characterize heterogeneity at different levels. However, this statistic is difficult to interpret in the case of logistic regression models applied to binary outcomes, because there is no estimate of individual-level residual error in these models (the variance of a binomial distribution is a function of the mean only). Alternative approaches to calculating ICC in the case of logistic regression models often depend on the overall prevalence of the outcome, so ICCs for models of outcomes with differing baseline prevalences cannot be compared. Also, the higher-level variance derived from the random effects is on the logit scale, whereas the individual-level variance is on the probability scale, so the two measures are not directly comparable.
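The dependence on prevalence can be illustrated with one such alternative, a first-order linearisation of the logistic model around the mean prevalence. The choice of this particular approximation is ours; the text above does not specify which alternative approaches are meant. Under the approximation, the cluster-level variance on the probability scale is roughly the logit-scale variance multiplied by [p(1 - p)]^2, and the individual-level variance is p(1 - p), where p is the prevalence implied by the intercept.

```python
import math


def inv_logit(x):
    """Convert a log-odds value to a probability."""
    return 1.0 / (1.0 + math.exp(-x))


def linearized_icc(beta0, var_cluster):
    """Approximate ICC on the probability scale via first-order linearisation.

    beta0: intercept on the logit scale (sets the baseline prevalence).
    var_cluster: random-effect variance on the logit scale.
    """
    p = inv_logit(beta0)
    w = p * (1.0 - p)
    between = var_cluster * w ** 2   # cluster variance moved to probability scale
    within = w                       # Bernoulli variance at the mean prevalence
    return between / (between + within)


# Same logit-scale variance, two different baseline prevalences.
icc_common = linearized_icc(beta0=0.0, var_cluster=1.0)   # prevalence 0.5
icc_rare = linearized_icc(beta0=-3.0, var_cluster=1.0)    # prevalence under 5%
```

With an identical logit-scale variance, the approximate ICC is 0.2 when the prevalence is 50% and considerably smaller when the outcome is rare, so the two values do not describe the same amount of clustering on a common scale.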
For these reasons, we calculated the Median Odds Ratio (MOR), a commonly used approach for measuring the effect of clustering in multi-level logistic regression models. Conceptually, the MOR is derived by calculating the odds ratio for every possible pair of regions contained in a random effect, where the region with the higher prevalence of the outcome is always placed in the numerator. The median of these odds ratios at a given regional level is then defined as the MOR. The MOR can be interpreted as